Overview

This document is an initial exploration of the FAA flight delay data, starting with the 2015 Airline Service Quality Performance (ASQP) data.

Variables in 2015 data

The data frame has 5,819,079 rows and 55 columns.

Overview of 2015 ASQP data. For factor variables, the most frequent value is shown. A dash marks a cell that does not apply.
Variable             Class      N_unique  Min_numeric  Max_numeric  Top_factor
YEAR                 numeric    1         2015         2015         -
QUARTER              numeric    4         1            4            -
MONTH                numeric    12        1            12           -
DAY_OF_MONTH         numeric    31        1            31           -
DAY_OF_WEEK          factor     7         -            -            4
FLIGHT_DATE          Date       365       -            -            -
AIRLINE_ID           factor     14        -            -            19393
CARRIER              factor     14        -            -            WN
FLIGHT_NUM           factor     6952      -            -            469
ORIGIN               factor     322       -            -            ATL
ORIGIN_CITY_NAME     factor     309       -            -            Chicago
ORIGIN_STATE         factor     53        -            -            CA
ORIGIN_STATE_FIPS    factor     53        -            -            6
ORIGIN_WAC           factor     53        -            -            91
DEST                 factor     322       -            -            ATL
DEST_CITY_NAME       factor     309       -            -            Chicago
DEST_STATE           factor     53        -            -            CA
DEST_STATE_FIPS      factor     53        -            -            6
DEST_WAC             factor     53        -            -            91
CRS_DEP_TIME_HR      numeric    24        0            23           -
CRS_DEP_TIME_MIN     numeric    60        0            59           -
DEP_TIME_HR          numeric    26        0            24           -
DEP_TIME_MIN         numeric    61        0            59           -
DEP_DELAY            numeric    1218      -82          1988         -
DEP_DELAY_MINS       numeric    1164      0            1988         -
DEP_DELAY_15         numeric    3         0            1            -
DEP_DELAY_GRPS       numeric    16        -2           12           -
DEP_TIME_BLK         character  19        -            -            -
TAXI_OUT             numeric    185       1            225          -
WHEELS_OFF           numeric    1441      1            2400         -
WHEELS_ON            numeric    1441      1            2400         -
TAXI_IN              numeric    186       1            248          -
CRS_ARR_TIME_HR      numeric    25        0            24           -
CRS_ARR_TIME_MIN     numeric    60        0            59           -
ARR_TIME_HR          numeric    26        0            24           -
ARR_TIME_MIN         numeric    61        0            59           -
ARR_DELAY            numeric    1241      -87          1971         -
ARR_DELAY_MINS       numeric    1158      0            1971         -
ARR_DELAY_15         numeric    3         0            1            -
ARR_DELAY_GRPS       numeric    16        -2           12           -
ARR_TIME_BLK         character  19        -            -            -
CANCELLED            numeric    2         0            1            -
CANCELLATION_CODE    factor     5         -            -            B
DIVERTED             numeric    2         0            1            -
CRS_ELAPSED_TIME     numeric    551       18           718          -
ACTUAL_ELAPSED_TIME  numeric    713       14           766          -
AIR_TIME             numeric    676       7            690          -
FLIGHTS              numeric    1         1            1            -
DISTANCE             numeric    1363      21           4983         -
DISTANCE_GRP         numeric    11        1            11           -
CARRIER_DELAY        numeric    1068      0            1971         -
WEATHER_DELAY        numeric    633       0            1211         -
NAS_DELAY            numeric    571       0            1134         -
SECURITY_DELAY       numeric    155       0            573          -
LATE_AIRCRAFT_DELAY  numeric    696       0            1331         -
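
A per-variable overview like the one above could be produced along the following lines. This is a minimal sketch in Python, not the code that generated the table: the `summarize` helper, its `factor_cols` argument, and the three sample rows are all made up for illustration. It assumes a missing value (`None`) counts as one distinct value, which might explain why DEP_DELAY_15 shows 3 unique values with a range of 0 to 1, but we have not confirmed that.

```python
from collections import Counter

def summarize(rows, factor_cols=()):
    """Return one summary dict per column for a list of row dicts.

    Columns named in factor_cols are treated as factors even when their
    values are numbers (for example ID codes)."""
    summary = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        present = [v for v in values if v is not None]
        info = {"variable": col, "class": "unknown",
                "n_unique": len(set(values)),  # assumption: None counts as a value
                "min": None, "max": None, "top": None}
        if col in factor_cols:
            info["class"] = "factor"
            if present:
                # Most frequent non-missing value.
                info["top"] = Counter(present).most_common(1)[0][0]
        elif present and all(isinstance(v, (int, float)) for v in present):
            info["class"] = "numeric"
            info["min"], info["max"] = min(present), max(present)
        elif present:
            info["class"] = type(present[0]).__name__
        summary.append(info)
    return summary

# Made-up sample rows, only to exercise the helper.
rows = [
    {"CARRIER": "WN", "DEP_DELAY": -3, "DEP_DELAY_15": 0},
    {"CARRIER": "AA", "DEP_DELAY": 20, "DEP_DELAY_15": 1},
    {"CARRIER": "WN", "DEP_DELAY": None, "DEP_DELAY_15": None},
]
summary = {s["variable"]: s for s in summarize(rows, factor_cols={"CARRIER"})}
```

With the sample rows, `summary["CARRIER"]["top"]` is `"WN"` and `summary["DEP_DELAY_15"]["n_unique"]` is 3, because the missing value is counted.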

Notes

Questions (for us to answer after reading the ASQP documentation):

  • NULL values for WHEELS_OFF or WHEELS_ON indicate no value in input data. What reasons might lead to missing values?
  • CRS stands for Computerized Reservation System. Should we take CRS time as the scheduled time that a consumer would see ‘at booking’?
  • What is the WAC code? It appears to correspond one-to-one with state: ORIGIN_WAC, DEST_WAC, ORIGIN_STATE and DEST_STATE each have 53 unique values.
  • What do the GRPS codes (DEP_DELAY_GRPS, ARR_DELAY_GRPS) represent?
  • Is DISTANCE in miles?
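
Two of these questions can be probed directly in the data before reading the documentation. The sketch below is in Python and uses hypothetical helpers: `is_one_to_one` checks whether two columns map onto each other exactly (for example state and WAC), and `guess_delay_group` encodes one guess about the GRPS codes, namely 15-minute bins clamped to the observed range of -2 to 12. That binning rule is our assumption, not something taken from the ASQP documentation; it would need to be compared against the actual DEP_DELAY_GRPS values. The state/WAC pairs other than CA/91 are made up.

```python
import math

def is_one_to_one(pairs):
    """True if each left value maps to exactly one right value and vice versa."""
    forward, backward = {}, {}
    for left, right in pairs:
        # setdefault stores the first pairing seen and returns it afterwards.
        if forward.setdefault(left, right) != right:
            return False
        if backward.setdefault(right, left) != left:
            return False
    return True

def guess_delay_group(delay_minutes):
    """Hypothetical rule: 15-minute bins, clamped to the range -2..12."""
    return max(-2, min(12, math.floor(delay_minutes / 15)))

# CA/91 comes from the overview table; the "XX" rows are invented.
consistent = is_one_to_one([("CA", 91), ("XX", 1), ("CA", 91)])
conflicting = is_one_to_one([("CA", 91), ("XX", 91)])

# Candidate group codes for the extreme and a few typical delays.
groups = [guess_delay_group(d) for d in (-82, -16, -1, 0, 14, 15, 1988)]
```

If the guess is right, a comparison of `guess_delay_group(DEP_DELAY)` with DEP_DELAY_GRPS over all rows should show no mismatches. The range -2 to 12 holds 15 distinct codes, so the 16 unique values in the overview table would then be those 15 plus a missing value.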

Additional data needs:

  • Simple table of latitude / longitude for each airport
  • Time zone for each departure and arrival airport, so that we can do our own calculations of time differences. If needed, we can derive this from the latitude and longitude of each airport.
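
Once a time zone is known for each airport, the time-difference calculation could look like the sketch below. It is written in Python under several assumptions: times are integers such as 1745, a value of 2400 means midnight at the end of the given date (the overview table shows 2400 as a maximum, but we have not confirmed its meaning), and the `elapsed_minutes` helper and the fixed-offset zones are illustrative only. Real airports would need zones that follow daylight saving time, for example `zoneinfo.ZoneInfo("America/New_York")` in place of a fixed offset.

```python
from datetime import date, datetime, timedelta, timezone

def elapsed_minutes(dep_date, dep_hhmm, dep_tz, arr_date, arr_hhmm, arr_tz):
    """Minutes between a local departure time and a local arrival time."""
    def to_utc(day, hhmm, tz):
        hours, minutes = divmod(hhmm, 100)
        # Start at local midnight and add an offset so 2400 rolls to the next day.
        local = datetime(day.year, day.month, day.day) + timedelta(
            hours=hours, minutes=minutes)
        # Attach the airport's zone, then convert to UTC before comparing.
        return local.replace(tzinfo=tz).astimezone(timezone.utc)
    delta = to_utc(arr_date, arr_hhmm, arr_tz) - to_utc(dep_date, dep_hhmm, dep_tz)
    return int(delta.total_seconds() // 60)

# Fixed offsets for illustration only; they ignore daylight saving time.
EASTERN = timezone(timedelta(hours=-5))
PACIFIC = timezone(timedelta(hours=-8))

day = date(2015, 3, 1)
coast_to_coast = elapsed_minutes(day, 800, EASTERN, day, 1100, PACIFIC)
late_night = elapsed_minutes(day, 2330, EASTERN, day, 2400, EASTERN)
```

Here a flight leaving at 0800 in a UTC-5 zone and landing at 1100 in a UTC-8 zone gives 360 minutes, and 2330 to 2400 in the same zone gives 30 minutes. Results like these could be compared against ACTUAL_ELAPSED_TIME as a consistency check.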

Visualization

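Counts of flights by month and carrier show some variation. In particular, AA's merger with US Airways shows up in the 2015 data: from July onward US reports zero flights, while AA's monthly count rises from 44,360 in June to 81,434 in July, close to the combined AA and US total for June (78,660). (Airline codes here; AA/US merger)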

Count of flights by month of 2015, by carrier
Carrier        1        2        3        4        5        6        7        8        9       10       11       12
AA        44,059   39,835   45,966   44,770   44,710   44,360   81,434   79,748   73,379   77,290   73,871   76,562
AS        13,257   12,194   14,276   13,974   14,682   15,075   15,821   16,095   14,271   14,467   13,950   14,459
B6        21,623   19,751   22,590   22,020   22,565   22,558   24,029   23,826   21,133   21,913   21,697   23,343
DL        64,421   60,884   74,166   72,170   74,815   77,255   80,741   80,947   72,063   75,552   72,228   70,639
EV        49,925   45,138   54,190   49,296   49,213   49,119   50,381   48,554   43,721   45,728   42,572   44,140
F9         6,829    5,809    6,950    7,148    8,118    7,893    8,090    8,142    7,873    8,101    7,763    8,120
HA         6,440    5,779    6,313    6,093    6,434    6,677    6,955    6,901    6,154    6,242    6,024    6,260
MQ        29,900   26,940   28,146   25,695   25,431   25,407   24,750   23,881   21,202   21,982   20,305   20,993
NK         8,743    8,089    9,400    9,496   10,051    9,826   10,351   10,432    9,948   10,208   10,164   10,671
OO        48,114   43,989   50,078   49,329   49,864   50,307   52,627   52,730   47,625   48,808   47,292   47,590
UA        38,395   36,235   43,603   41,342   44,411   46,084   46,478   45,413   41,778   45,894   42,647   43,443
US        33,489   30,153   34,516   32,496   33,761   34,300        0        0        0        0        0        0
VX         4,731    4,223    4,873    4,915    5,236    5,260    5,411    5,688    5,154    5,464    5,414    5,534
WN       100,042   90,172  109,245  106,407  107,702  109,776  113,650  108,179  100,645  104,516  104,045  107,476
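
A carrier-by-month count table of this shape could be built as in the following sketch. It is in Python with a hypothetical `flights_by_carrier_month` helper and a handful of made-up rows; only the CARRIER and MONTH column names come from the data. Months with no flights for a carrier come out as 0, which is how the US row would end up with zeros from July onward.

```python
from collections import Counter

def flights_by_carrier_month(flights):
    """Count flights per carrier and month.

    flights is an iterable of dicts with CARRIER and MONTH keys.
    Returns {carrier: [count for month 1, ..., count for month 12]}."""
    counts = Counter((f["CARRIER"], f["MONTH"]) for f in flights)
    carriers = sorted({carrier for carrier, _ in counts})
    # A Counter returns 0 for pairs that never occurred.
    return {c: [counts[(c, m)] for m in range(1, 13)] for c in carriers}

# Made-up sample rows, only to exercise the helper.
sample = [
    {"CARRIER": "US", "MONTH": 6},
    {"CARRIER": "US", "MONTH": 6},
    {"CARRIER": "AA", "MONTH": 6},
    {"CARRIER": "AA", "MONTH": 7},
    {"CARRIER": "AA", "MONTH": 7},
    {"CARRIER": "AA", "MONTH": 7},
]
table = flights_by_carrier_month(sample)
```

In the sample, `table["US"]` has 2 in the June position and 0 in July, while `table["AA"]` goes from 1 in June to 3 in July.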